
server,cli: Support Init Command #16371

Merged
3 commits merged on Jul 27, 2017

Conversation

adamgee
Contributor

@adamgee adamgee commented Jun 6, 2017

@cockroach-teamcity
Member

This change is Reviewable

@bdarnell bdarnell changed the title from "Support Init Command" to "server: Support Init Command" on Jun 7, 2017
@bdarnell bdarnell changed the title from "server: Support Init Command" to "server,cli: Support Init Command" on Jun 7, 2017
@adamgee adamgee force-pushed the 5974_revisit_cluster_init branch 2 times, most recently from e528993 to 809b910 on June 9, 2017 18:55
@bdarnell
Contributor

Reviewed 2 of 2 files at r1, 6 of 6 files at r7.
Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


pkg/cli/flags.go, line 277 at r7 (raw file):

	}

	varFlag(initCmd.Flags(), &serverCfg.JoinList, cliflags.Join)

Why? I think cockroach init should take --host like a client command, not --join like cockroach start.


pkg/cli/init.go, line 24 at r7 (raw file):

	"github.com/spf13/cobra"

	//	"github.com/cockroachdb/cockroach/pkg/server"

Delete before merging.


pkg/cli/init.go, line 47 at r7 (raw file):

	request := &serverpb.BootstrapRequest{}
	fmt.Println("Request:", request)
	log.Info(ctx, "Request: ", request)

We probably want to remove these prints and logs before merging.


pkg/server/init.go, line 33 at r7 (raw file):

	server       *Server
	bootstrapped chan struct{}
	mux          sync.Mutex

s/mux/mu/

If the mutex only protects waiting, we like to use this anonymous struct pattern:

mu struct {
    sync.Mutex
    waiting bool
}

But if it's held for the entire duration of Bootstrap, it might be better not to do that to highlight the fact that it protects more than just the variables it contains.
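For illustration, here is a minimal standalone sketch of that pattern; the field set and the setWaiting helper are hypothetical, not taken from this PR:

package example

import "sync"

type initServer struct {
	bootstrapped chan struct{}

	mu struct {
		sync.Mutex
		// waiting is the only field the mutex guards in this sketch.
		waiting bool
	}
}

func (s *initServer) setWaiting(w bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.mu.waiting = w
}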


pkg/server/init.go, line 42 at r7 (raw file):

func (s *initServer) serve(ctx context.Context, ln net.Listener) {
	grpcServer := s.server.grpc

If we're sharing the grpcServer with the rest of the Server, how can we be confident that the other services haven't been registered yet? The initServer could also remain after the initialization phase, which I'd prefer to avoid.

Can we pass in an RPCContext and make a new grpc.Server instead?
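As a rough sketch of what that could look like, assuming the initServer keeps its existing fields and that the generated registration function follows the usual protoc naming for a service called Init (RegisterInitServer); none of the wiring below is taken from the PR itself:

func (s *initServer) serve(ctx context.Context, ln net.Listener) {
	// A grpc.Server dedicated to the Init service, independent of the
	// Server's main gRPC server, so nothing else can be registered on it.
	grpcInit := grpc.NewServer()
	serverpb.RegisterInitServer(grpcInit, s)

	go func() {
		// cmux routes all connections here until the node is bootstrapped.
		if err := grpcInit.Serve(ln); err != nil {
			log.Infof(ctx, "init server exited: %v", err)
		}
	}()

	go func() {
		// Tear the dedicated server down once bootstrap completes so it
		// cannot linger past the initialization phase.
		<-s.bootstrapped
		grpcInit.Stop()
	}()
}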


pkg/server/init.go, line 51 at r7 (raw file):

func (s *initServer) awaitBootstrap() error {
	{

Remove these braces; they don't mean anything (in Go, braces define variable scopes but not defer scopes).
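A tiny runnable example of the point being made here: a deferred call runs when the enclosing function returns, not when the surrounding brace block ends.

package main

import "fmt"

func main() {
	{
		defer fmt.Println("deferred")
		fmt.Println("inside block")
	}
	fmt.Println("after block")
	// Prints:
	//   inside block
	//   after block
	//   deferred   <- runs at function exit, not at the closing brace
}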


pkg/server/init.go, line 53 at r7 (raw file):

	{
		s.mux.Lock()
		s.waiting = true

We should probably set waiting to false when this method exits (in a deferred function), just as an extra sanity check.
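One way to express that sanity check, sketched against the mu-struct layout suggested above; the wait itself is elided and purely illustrative:

func (s *initServer) awaitBootstrap() error {
	s.mu.Lock()
	s.mu.waiting = true
	s.mu.Unlock()
	// Extra sanity check: clear the flag on every exit path.
	defer func() {
		s.mu.Lock()
		s.mu.waiting = false
		s.mu.Unlock()
	}()

	<-s.bootstrapped // hypothetical wait; the real method may select on more channels
	return nil
}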


pkg/server/server.go, line 542 at r7 (raw file):

	readyToServe := int32(0)
	initL := m.Match(func(_ io.Reader) bool {
		return atomic.LoadInt32(&readyToServe) == 0

Ah, this is a clever approach to the problem. I'd rather not expose the initServer at all if we're already bootstrapped, though. That would mean keeping serveOnMux and calling it in two places, both at the end where it was originally and just before awaitBootstrap.
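For readers unfamiliar with cmux, here is a self-contained sketch of the flag-gated matcher technique under discussion; the helper name and listener plumbing are made up:

package example

import (
	"io"
	"net"
	"sync/atomic"

	"github.com/soheilhy/cmux"
)

// splitListeners claims every connection for the init listener until the
// returned flag is set to 1, after which the matcher rejects everything
// and the init listener is effectively removed from the mux.
func splitListeners(ln net.Listener) (net.Listener, net.Listener, *int32) {
	m := cmux.New(ln)
	var readyToServe int32
	initL := m.Match(func(_ io.Reader) bool {
		return atomic.LoadInt32(&readyToServe) == 0
	})
	anyL := m.Match(cmux.Any())
	go func() { _ = m.Serve() }()
	return initL, anyL, &readyToServe
}

// Once the node is bootstrapped, the caller flips the flag:
//   atomic.StoreInt32(readyPtr, 1)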


pkg/server/server.go, line 599 at r7 (raw file):

	s.stopper.RunWorker(workersCtx, func(context.Context) {
		<-s.stopper.ShouldQuiesce()
		netutil.FatalIfUnexpected(anyL.Close()) // TODO: Do we need to also close the other listeners?

Yeah, it seems like we should close all of them or none of them here. I'm not sure why we only close this one.


pkg/server/server.go, line 792 at r7 (raw file):

	// Set the readyToServe bit which removes the initL from cmux.
	if !atomic.CompareAndSwapInt32(&readyToServe, 0, 1) {

This could be atomic.StoreInt32(&readyToServe, 1). There are no other conflicting changes that could be happening.


pkg/server/serverpb/init.proto, line 1 at r7 (raw file):

// Copyright 2016 The Cockroach Authors.

s/2016/2017/


Comments from Reviewable

@bdarnell
Contributor

I'm taking over this PR; I've added some tests and done a little cleaning up. There's still a little more I want to do (mainly figure out why the acceptance tests fail if I run the bootstrap against a node other than zero), but I think it's pretty much ready to go.

@petermattis
Collaborator

Review status: 1 of 16 files reviewed at latest revision, 14 unresolved discussions, some commit checks failed.


pkg/cli/init.go, line 30 at r10 (raw file):

	Use:   "init",
	Short: "TODO",
	Long:  "TODO",

Reminder to fill these in.


pkg/cli/init.go, line 44 at r10 (raw file):

	request := &serverpb.BootstrapRequest{}
	_, err = c.Bootstrap(ctx, request)
	if err != nil {

Maybe tighten this up a small bit:

if _, err := c.Bootstrap(ctx, &serverpb.BootstrapRequest{}); err != nil {

pkg/server/server.go, line 581 at r10 (raw file):

	// Inject an initialization listener that can optionally intercept
	// all connections.

that will intercept all connections while the cluster is initializing.


Comments from Reviewable

@bdarnell bdarnell force-pushed the 5974_revisit_cluster_init branch 5 times, most recently from dadba7b to ee3bf84 on July 22, 2017 03:12
@bdarnell
Contributor

(mainly figure out why the acceptance tests fail if I run the bootstrap against a node other than zero)

The reason this was failing is that nodes don't start their gossip server until they're initialized. I've made an attempt to change this, but it runs into difficulties at the RPC layer (we need to stop and restart the RPC server to switch from init to regular mode, and make sure that all the long-lived gossip connections either stay connected or stop and restart cleanly) and inside gossip (the server side of gossip assumes the cluster has been initialized before the server starts).

So we can either
A) Live with the limitation that the init command must go to nodes that are join targets, instead of an arbitrary node
B) Fix the above issues
C) Introduce a second RPC in the init service to tell a join target to stop waiting for the init command

I'm leaning towards option A. @a-robinson, would that interfere with the use of the init command in kubernetes/swarm/etc deployments?

@petermattis
Collaborator

Option A is likely the easiest, but it feels a bit unsatisfactory. I'm a little out of date with regard to the node initialization sequence. Can you expand on what goes wrong when the init RPC is sent to a node other than node zero?

@a-robinson
Contributor

Can you clarify what you mean by "nodes that are join targets, instead of an arbitrary node"? Does this mean that your join configuration effectively needs to be a DAG where all nodes can reach the node that you send the init command to?

@bdarnell
Contributor

Can you expand on what goes wrong when the init RPC is sent to a node other than node zero?

In the current implementation, a node is either serving the init method or the regular rpc interface, but not both at the same time. When node zero (the join target) is initialized, it switches to the regular rpc interface and the other nodes can join it. When another node is initialized, it tries to join node zero, but it can't because the gossip interface is not active yet. Option B is to enable the gossip interface earlier (and deal with the issues that arise because we have not previously had an active gossip server with an unknown cluster ID); option C is to handle this by extending the initialization rpc interface.

Does this mean that your join configuration effectively needs to be a DAG where all nodes can reach the node that you send the init command to?

Not just a DAG. In fact adding a cycle is one way to fix the problem. If for example all your nodes point to nodes 1, 2 and 3 as join targets, then you must init one of those three nodes. If your join flags describe a ring (node N joins node N-1 modulo num_nodes), you can init any node.

The status quo has similar constraints on the join graph: every node must be able to reach the node that was started without a join flag. The new system requires that every node be able to reach the init target. The question is just whether that undermines some of the flexibility we hoped to gain with the init command.

@a-robinson
Contributor

a-robinson commented Jul 24, 2017

Not just a DAG. In fact adding a cycle is one way to fix the problem. If for example all your nodes point to nodes 1, 2 and 3 as join targets, then you must init one of those three nodes. If your join flags describe a ring (node N joins node N-1 modulo num_nodes), you can init any node.

The status quo has similar constraints on the join graph: every node must be able to reach the node that was started without a join flag. The new system requires that every node be able to reach the init target. The question is just whether that undermines some of the flexibility we hoped to gain with the init command.

Ah, of course, it needs to be fully connected, not acyclic (as in #13027).

Option A does feel a bit half-hearted, but it's just imposing a restriction that we could later open up to improve usability. This is how it'll affect each orchestration tool:

  • Kubernetes: It'll work fine, and in the process slightly simplify the config and eliminate one very unlikely edge case of node 1 creating a new cluster ID if it loses its disk and gets restarted while all the other processes are down.
  • Mesos: It'll work fine, but won't simplify anything (due to some of Mesos's more sophisticated options we're taking advantage of).
  • Swarm: It'll simplify our current deployment instructions, but won't work with a possible simplification of our insecure deployment (docs: docker swarm service unique service name #17133) due to the containers not having known names in advance. That's not a huge deal, though, given that that approach was only going to be practical for insecure deployments in the short term anyway.

I assume that for option C what you're proposing is that when a node gets initialized it will effectively forward on its new cluster ID via an RPC on the Init service to all targets it's trying to join? I actually like that quite a bit. It'd be fairly isolated from everything else. Option B, on the other hand, would be quite tricky to get right.

@bdarnell
Contributor

I assume that for option C what you're proposing is that when a node gets initialized it will effectively forward on its new cluster ID via an RPC on the Init service to all targets it's trying to join?

Yes, that's the idea. It's a more isolated solution than option B, but there's still a little plumbing complexity to use the gossip bootstrap targets in this new context (and it gets tricky if the join target is a load balancer or DNS alias, although now that I think about it those cases would probably also have difficulty forming a fully connected graph).

@a-robinson
Contributor

What happened to the goal from the RFC of verifying that the specified number of nodes are connected? That would appear not to work without something a little more sophisticated (like option B).

@bdarnell
Contributor

We decided to drop that from the RFC because it seemed like unnecessary complexity.

@a-robinson
Contributor

That's what I thought, but then I got confused by seeing it still in the summary of the RFC. I'll change it.

but there's still a little plumbing complexity to use the gossip bootstrap targets in this new context

Is that so? Would a node that needs to be initialized have any bootstrap targets other than what was specified on the command line?

Anyways, I take it this is ready to review? If so I'll take a look at it within the next day or so. I'd like to see option C get implemented if we have time, but it seems like it'll just be additive so discussing it shouldn't block reviewing option A if that's ready.

@bdarnell
Contributor

Is that so? Would a node that needs to be initialized have any bootstrap targets other than what was specified on the command line?

Just the ones on the command line. It's just a little annoying to use them because we have to work outside the regular connection pool (which wants heartbeats to work, etc).

Anyway, yes, this is ready to review. We can implement option C later if we have time and/or it seems to be useful.

@a-robinson a-robinson self-requested a review July 26, 2017 15:32
@a-robinson
Contributor

Nice, this turned out very clean!


Reviewed 1 of 2 files at r1, 1 of 7 files at r6, 2 of 8 files at r8, 3 of 8 files at r9, 4 of 9 files at r10, 6 of 6 files at r11.
Review status: all files reviewed at latest revision, 22 unresolved discussions, some commit checks failed.


pkg/cli/init.go, line 33 at r11 (raw file):

Perform one-time-only initialization of a CockroachDB cluster.

After starting one or more nodes with --join flags, run the init

The --join requirements needed for init to work should be included in the description here.


pkg/server/init.go, line 33 at r7 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

s/mux/mu/

If the mutex only protects waiting, we like to use this anonymous struct pattern:

mu struct {
    sync.Mutex
    waiting bool
}

But if it's held for the entire duration of Bootstrap, it might be better not to do that to highlight the fact that it protects more than just the variables it contains.

It looks like this never got changed, but it should


pkg/server/init.go, line 56 at r11 (raw file):

waiting: false

Not really needed/idiomatic


pkg/server/init.go, line 94 at r11 (raw file):

	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.waiting {

Isn't there an inherent race between when we start serving and when we start awaiting bootstrap? It'd be safer to just initialize waiting to true in newInitServer and set it to false in awaitBootstrap.
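A sketch of the alternative being suggested here, assuming the mu-struct layout from earlier; the constructor name and body are guesses, not the PR's code:

// Hypothetical constructor: waiting starts true so that a Bootstrap RPC
// arriving before awaitBootstrap runs is still accepted.
func newInitServer(s *Server) *initServer {
	is := &initServer{
		server:       s,
		bootstrapped: make(chan struct{}),
	}
	is.mu.waiting = true
	return is
}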


pkg/server/init.go, line 95 at r11 (raw file):

	defer s.mu.Unlock()
	if !s.waiting {
		return nil, errors.New("Init not expecting Bootstrap")

Since this is going to be an error that new users may see when they're failing to set up their cluster properly, something with a little more explanation would be nice.


pkg/server/init.go, line 99 at r11 (raw file):

	if err := s.server.node.bootstrap(ctx, s.server.engines); err != nil {
		log.Error(ctx, "Node bootstrap failed: ", err)

s/Node/node for consistency with the rest of our log messages


pkg/server/node.go, line 333 at r11 (raw file):

func (n *Node) bootstrap(ctx context.Context, engines []engine.Engine) error {
	n.initialBoot = true
	// This node has no initialized stores and no way to connect to

This comment isn't really applicable anymore.


pkg/server/server.go, line 775 at r11 (raw file):

		log.Infof(ctx, "**** add additional nodes by specifying --join=%s", s.cfg.AdvertiseAddr)
	} else {
		log.Info(ctx, "No stores bootstrapped and --join flag specified, starting Init Server.")

s/No/no


pkg/server/server.go, line 804 at r11 (raw file):

	if err != nil {
		log.Error(ctx, "Node.start error: ", err)

Did you mean to leave this in? The error should already be getting printed out by the caller, shouldn't it?


Comments from Reviewable

@bdarnell bdarnell force-pushed the 5974_revisit_cluster_init branch from ee3bf84 to 6dce56a on July 26, 2017 21:27
@bdarnell
Contributor

Review status: 13 of 16 files reviewed at latest revision, 22 unresolved discussions, some commit checks failed.


pkg/server/init.go, line 33 at r7 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

It looks like this never got changed, but it should

Done


pkg/server/init.go, line 56 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

waiting: false

Not really needed/idiomatic

Done


pkg/server/init.go, line 94 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Isn't there an inherent race between when we start serving and when we start awaiting bootstrap? It'd be safer to just initialize waiting to true in newInitServer and set it to false in awaitBootstrap.

Renamed the variable to awaitDone so it can keep a zero default.


pkg/server/init.go, line 95 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Since this is going to be an error that new users may see when they're failing to set up their cluster properly, something with a little more explanation would be nice.

Changed to "bootstrap called after initialization complete". Better suggestions welcome.


pkg/server/init.go, line 99 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/Node/node for consistency with the rest of our log messages

Done


pkg/server/node.go, line 333 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

This comment isn't really applicable anymore.

Removed


pkg/server/server.go, line 581 at r10 (raw file):

Previously, petermattis (Peter Mattis) wrote…

that will intercept all connections while the cluster is initializing.

Done


pkg/server/server.go, line 775 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/No/no

Done


pkg/server/server.go, line 804 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Did you mean to leave this in? The error should already be getting printed out by the caller, shouldn't it?

Removed


Comments from Reviewable

@a-robinson
Contributor

:lgtm:


Reviewed 3 of 3 files at r12.
Review status: all files reviewed at latest revision, 19 unresolved discussions, some commit checks failed.


pkg/cli/init.go, line 33 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

The --join requirements needed for init to work should be included in the description here.

ping


pkg/server/init.go, line 95 at r11 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Changed to "bootstrap called after initialization complete". Better suggestions welcome.

I might tack a "cluster has already been initialized" on the end, but this works.


pkg/server/init.go, line 78 at r12 (raw file):

	}
	s.mu.Lock()
	s.mu.awaitDone = false

s/false/true/


Comments from Reviewable

@bdarnell bdarnell force-pushed the 5974_revisit_cluster_init branch from 6dce56a to 86d5e41 on July 26, 2017 21:47
@bdarnell
Contributor

Review status: 14 of 16 files reviewed at latest revision, 19 unresolved discussions, some commit checks failed.


pkg/cli/init.go, line 33 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

ping

Done


pkg/server/init.go, line 95 at r11 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I might tack a "cluster has already been initialized" on the end, but this works.

Done. Note that there's probably a lot more work to be done on error messages here. This error is only returned on a very narrow race condition; if init is called on a node that is fully initialized and transitioned to serving mode, the error that comes back is something like a 404 because the init server is not even running.


pkg/server/init.go, line 78 at r12 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

s/false/true/

Oops. Done


Comments from Reviewable

adamgee and others added 3 commits July 26, 2017 21:51
--join flag is passed.

Use cmux to initially swallow all connections into the `initL` which is
served by the initServer.  Only the Init grpc interface is registered
so all other connections will get automatically closed.  Once cockroach
is ready to serve, flip the initL matcher to match nothing, effectively
removing the initServer.

Add `init` cli command.